INTRODUCTION

A major concern of my research group over the last twenty years has been the development
of quantitative quantum mechanical procedures that chemists, including ourselves, could use as a
practical aid in studies of chemical problems, in the same kind of way as NMR or mass
spectrometry. To be useful in this connection, a procedure must reproduce the energies and other
relevant properties of molecules with "chemical" accuracy, and it must also be applicable to the
molecules in which chemists are specifically interested, which are often quite large, at reasonable
cost, using readily available computers.

It seemed clear to me from the start that current so-called ab initio methods would not be able
to meet these conditions in the foreseeable future, and this indeed is still the case today [1]. We
therefore adopted an expedient which has proved successful in many other areas of theoretical
chemistry where exact solutions of key equations are unavailable or too expensive, i.e. taking a
crude (and hence cheap) approximation and then trying to upgrade its accuracy by introducing
adjustable parameters. The Debye-Hückel theory of strong electrolytes is a classic example.

This so-called semiempirical approach to quantum chemistry was not in itself novel. Indeed,
the approximations we have used were first introduced by Pople. However, previous attempts to
parametrize them had failed to give satisfactory results and it had become generally accepted that
procedures of this kind could never be of any real value in chemistry. We were able to show that
this failure was due simply to lack of effort [2]. Over the years we have been able to develop a
series of successively better treatments, based on Pople's INDO and NDDO approximations, and
the three latest, MINDO/3 [3], MNDO [4], and AM1 [5], are now being widely used. Our own
studies of a wide range of problems concerning chemical behaviour, including the mechanisms of
numerous reactions, have led in many cases to major revisions of conclusions that had become
embedded in chemical theory [1,6,7].

The success of these treatments is admittedly astonishing, given the extreme crudity of the
approximations on which they are based. Many theoreticians have indeed rejected our claims as
mathematically impossible, an attitude which would be wholly justifiable if our treatments had
been put forward as approximations to the Schrödinger equation. This, however, is not the case.
As we have repeatedly pointed out, our object has been quite different. We have been trying to
develop effective molecular models [6a, 6d]. A model is a simple device that mimics the behaviour
of a more complex one. We may then be able to predict the behaviour of a device that is too complex
for rigorous analysis by observing the behaviour of an effective model. The value of a model in
this connection depends only on how well it mimics the behaviour of the parent system, not on
direct comparisons of the two. One cannot, for example, dismiss a model on the grounds that it is
"made of cheap plastic". Assessment of our treatments must likewise be based solely on how
well they perform in practice, not on the accuracy of the approximations on which they are based.

In any MO treatment of a molecule, each of the terms in the expression for the total energy
can be given a physical interpretation, corresponding to the kinetic energy of one of the particles
(electrons or nuclei) involved or to the electrostatic interaction between two of them. Errors in
the energies calculated for a number of different molecules can then usually be attributed to
errors in specific terms. In our approach, where some or all of these terms are replaced by
parametric expressions, the errors can then be corrected, or at least reduced, by appropriate
adjustment of the corresponding parameters.

The extent to which the inherent deficiencies of a given MO treatment can be countered by
parametrization is naturally limited. One cannot hope to eliminate errors that are directly due
to the basic approximations made in the treatment, for example the neglect of overlap in
treatments such as INDO or NDDO. Semiempirical treatments should therefore be based on
the best possible approximations. Here, however, limits are set by the performance of currently
available computers. Since our object is to predict the behaviour of molecules of interest to
chemists, which are commonly quite large, we must be able to carry out calculations for the
corresponding models.

Parametrization involves finding a minimum in the parameter hypersurface, the multidimensional
hypersurface that represents the RMS error in the properties calculated for the basis set
molecules as a function of the parameters. The parameters are then tested by carrying out
calculations for a selected set of molecules for which experimental data are available, including
as many different kinds of molecule as possible. This test involves a subjective assessment of the
significance of the errors for specific molecules. If the results for one or more key molecules are
unsatisfactory, the parametrization is repeated with increased weight being given to molecules of
that kind. The situation is further complicated by the fact that parameter hypersurfaces commonly
have numerous minima. There is no way to tell whether a given minimum is, or is not, the global
minimum. Furthermore, there is no guarantee that the global minimum for a given basis set will
correspond to the "best" set of parameters for the atoms in question, because it may lead to
chemically unacceptable errors in the results for specific molecules. In short, parametrizing one
of our semiempirical procedures is an extremely laborious and very frustrating undertaking which
moreover requires chemical knowledge and judgement as well as perseverance.
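
The search described above can be illustrated with a small sketch. It is not the parametrization program used in this work; the toy model, the crude pattern search, and all names (weighted_rms, local_minimize, multi_start, toy_model) are assumptions introduced purely for illustration. It shows a weighted RMS error function, a local minimizer, and repeated starts from different points, which is one simple way of exposing the fact that different starting points can end in different minima.

```python
import math
import random


def weighted_rms(params, model, reference, weights):
    """Weighted RMS error of model(params, key) against reference values."""
    total = sum(weights[k] * (model(params, k) - ref) ** 2
                for k, ref in reference.items())
    return math.sqrt(total / sum(weights[k] for k in reference))


def local_minimize(f, start, step=0.5, tol=1e-6, max_iter=10000):
    """Crude pattern search: try +/- step on each parameter in turn and
    halve the step whenever no trial move lowers the error."""
    x, fx = list(start), f(start)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0
    return x, fx


def multi_start(f, n_params, n_starts=20, span=5.0, seed=0):
    """Run the local search from several random starting points and return
    all (parameters, error) pairs, lowest error first."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        start = [rng.uniform(-span, span) for _ in range(n_params)]
        results.append(local_minimize(f, start))
    return sorted(results, key=lambda r: r[1])


# Hypothetical toy "model": each "molecule" is just a number x, and the
# calculated property is sin(p0 * x) + p1. The oscillatory dependence on p0
# tends to give an error surface with more than one minimum.
def toy_model(params, x):
    return math.sin(params[0] * x) + params[1]


TOY_MOLECULES = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
TOY_REFERENCE = {x: toy_model([1.3, 0.5], x) for x in TOY_MOLECULES}
TOY_WEIGHTS = {x: 1.0 for x in TOY_MOLECULES}


def toy_error(params):
    return weighted_rms(params, toy_model, TOY_REFERENCE, TOY_WEIGHTS)
```

Calling multi_start(toy_error, 2) returns the minima found from each start, sorted by error. Raising the entry in TOY_WEIGHTS for a poorly reproduced molecule and repeating the search corresponds to the reweighting step described above. Nothing in the procedure can certify that the lowest minimum found is the global one.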

Our two latest treatments, MNDO and AM1, are based on Pople's NDDO approximation,
further simplified to save computing time. These additions include the core approximation,
estimation of the electron repulsion integrals from a simple parametric function suggested by
Dewar and Sabelli, and allowance for electron correlation by the Pariser-Parr [8] expedient of
adjusting the electron repulsion integrals (EE).

MNDO suffered from a systematic overestimation of the repulsions between nonbonded
atoms, which led to an overestimation of steric effects and intermolecular repulsions and, in
particular, to failure to reproduce hydrogen bonds. When attempts to correct these errors by
reparametrization failed, we tried modifying the core repulsion (CR) function by adding terms
containing additional parameters. This expedient led to an improved treatment (AM1) in which the
systematic errors in MNDO seemed at first to have been overcome. Extensive use of AM1 has,
however, shown that although the situation in AM1 is better, the errors in question have not been
eliminated. In particular, although AM1 reproduces the heats of formation of hydrogen bonds
reasonably well, the predicted geometries are unsatisfactory.

We have now discovered the real cause of the error. The parametric function used to calculate
the electron-electron repulsions (EE) in MNDO and AM1 does not have the proper dependence
on internuclear distance, the calculated repulsions between electrons on two different atoms
being too small at distances significantly greater than the normal bond length. Since the
core-electron attractions are equated to a sum of EE-type terms, they too are underestimated, and
since the interactions between two atoms involve one set of electron-electron repulsions but two
sets of core-electron attractions, the error in the EE integrals leads to an underestimation of the
attractions between the atoms and hence to a total energy that is too positive. The error in MNDO
was therefore due to underestimation of the electron-electron repulsions, which, in the MNDO
formalism, are unrelated to the core repulsions. Any improvement brought about by
reparametrization, or by changes in the CR function, can therefore be achieved only at the expense
of compensating errors elsewhere, the latter being greater the more effectively the "MNDO errors"
are dealt with. Since the parametrization of AM1 was designed to optimize its overall performance,
the MNDO errors were reduced but not completely eliminated.
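
The sign of this error can be seen from a rough bookkeeping sketch. It rests on simplifying assumptions that are not part of the actual formalism: two neutral atoms A and B whose valence electron populations equal their core charges Z_A and Z_B, and a single averaged two-center repulsion integral, written here as \gamma_{AB}. The symbols \gamma_{AB}, \delta, and E_CR are introduced only for this illustration.

```latex
\begin{align*}
E_{AB} &\approx
  \underbrace{Z_A Z_B\,\gamma_{AB}}_{\text{electron-electron repulsion}}
  \;-\; \underbrace{Z_A Z_B\,\gamma_{AB}}_{\text{core } A \text{ with electrons on } B}
  \;-\; \underbrace{Z_B Z_A\,\gamma_{AB}}_{\text{core } B \text{ with electrons on } A}
  \;+\; E_{\mathrm{CR}} \\
       &= -\,Z_A Z_B\,\gamma_{AB} + E_{\mathrm{CR}}, \\[4pt]
\gamma_{AB}^{\mathrm{calc}} &= \gamma_{AB} - \delta \quad (\delta > 0)
  \;\Longrightarrow\;
  E_{AB}^{\mathrm{calc}} - E_{AB} = +\,Z_A Z_B\,\delta \;>\; 0 .
\end{align*}
```

Under these assumptions an underestimate \delta in the EE integral enters once with a positive sign and twice with a negative sign, so the net interaction energy comes out too positive by Z_A Z_B \delta. The core repulsion term E_CR does not contain \gamma_{AB}, which is why adjusting it can only mask the error.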

This conclusion is supported by Stewart's PM3 procedure [9], a reparametrization of AM1
directed to eliminating these errors completely. It now seems to be generally agreed that any gains
achieved in this way are counterbalanced by more serious losses elsewhere.

This problem might be overcome by modifying the empirical EE function in AM1. However,
any such modification might well lead to other problems. A better solution would be to attack the
problem at its root by using theoretical values for the EE integrals, in other words, by replacing
the basic AM1 approximation by a better one in which the simplification leading to a specific
type of error is avoided. This approach would have the further advantage of allowing easy
extension to spd basis sets, the AM1 formalism becoming very cumbersome if d AOs are included.

Work on this new treatment, which we termed SAM1 (Semi-Ab-initio Model 1), began some
years ago at the University of Texas, in Austin. A corresponding computer program was
developed and included in a version of AMPAC, and a start was made on parametrizing it for the
"organic" elements (C, H, O, N). The work carried out with support from the above AFOSR grant
has been an extension of this work.


CURRENT STATUS OF SAM1

SAM1 follows the same basic pattern as AM1, being likewise based on the NDDO
approximation together with the core approximation, and the one-center integrals are likewise
treated as parameters. The electron repulsion integrals are calculated theoretically, using the
STO-3G basis set, and scaled by a measure of orbital overlap to allow for electron correlation,
following the idea pioneered by Pariser and Parr [8]. The field due to the core of atom m is
equated to minus that generated by $Z_m$ valence shell s electrons, $Z_m$ being the core charge in
units of the electronic charge. Thus the attraction between the core of atom m and an electron in
the AO $\phi_n$ of atom n is set equal to $-Z_m (s_m s_m; \phi_n \phi_n)$, where
$(s_m s_m; \phi_n \phi_n)$ is the electron repulsion integral between the s AO of atom m and the
AO $\phi_n$ of atom n.

Work on SAM1 began three years ago in Austin, at the University of Texas, before I moved
to Florida. We wrote a basic computer program to carry out SAM1 calculations, which was
incorporated in the current version of our AMPAC and parametrization programs, and preliminary
parametrizations established what seemed to be a suitable form for the Pariser-Parr scaling
function. The move to the University of Florida naturally caused disruption, particularly since it
involved transferring our programs from the Alliant FX8 computer we had in Austin to the two
SUN workstations and the IBM RS/6000 workstation which we acquired in Gainesville. It is
interesting to note that these together have provided us with at least ten times more computing
time than the Alliant while costing, collectively, less than one-fifth as much. A brief summary of
the status of SAM1 at the end of 1991 follows.

A. Computer Programs. The original SAM1 program was a makeshift affair in which the
integrals were taken from a standard package and in which geometry optimization was carried
out using derivatives found by finite difference. It has now been completely rewritten and
optimized, and geometry optimization is now carried out using analytical derivatives.
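
The difference between the two kinds of derivative can be sketched on a toy energy function. The example below is not taken from the SAM1 program: the energy is a sum of Morse terms over a list of bond lengths, and the parameters D, A, and R0 are arbitrary values chosen for illustration. It compares a central finite-difference gradient with the closed-form gradient of the same function.

```python
import math

# Hypothetical Morse parameters in arbitrary units (illustration only).
D, A, R0 = 100.0, 2.0, 1.1


def energy(bonds):
    """Toy energy: a sum of Morse terms, one per bond length."""
    return sum(D * (1.0 - math.exp(-A * (r - R0))) ** 2 for r in bonds)


def analytic_gradient(bonds):
    """Closed-form dE/dr for each bond length: 2*D*A*(1 - e)*e, e = exp(-A*(r - R0))."""
    grad = []
    for r in bonds:
        e = math.exp(-A * (r - R0))
        grad.append(2.0 * D * A * (1.0 - e) * e)
    return grad


def finite_difference_gradient(bonds, h=1e-5):
    """Central differences: two full energy evaluations per coordinate."""
    grad = []
    for i in range(len(bonds)):
        plus = bonds[:]
        plus[i] += h
        minus = bonds[:]
        minus[i] -= h
        grad.append((energy(plus) - energy(minus)) / (2.0 * h))
    return grad
```

For N coordinates the finite-difference gradient costs 2N energy evaluations and its accuracy depends on the step h, whereas the analytical gradient is obtained in a single pass. In a real calculation each energy evaluation is a full SCF calculation, so the saving is considerable.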

B. Parametrization. Preliminary studies confirmed the efficacy of the SAM1 algorithm
established in Austin, in particular the scaling function used to allow for electron correlation.
Parameters for carbon and hydrogen were obtained which gave results as good as those from AM1,
using the unmodified MNDO CR function. While we later included Gaussian terms to improve
the calculated geometries, the number of parameters for C and H is still smaller in SAM1 than in
AM1.

Problems arose in the case of nitrogen and oxygen, particularly for compounds containing
NN bonds, for which the calculated heats of formation were too negative. Repeated attempts to
improve the results led to increased errors elsewhere. We are now sure that the error is due to
another basic simplification in the SAM1 formalism, namely use of the ZDO approximation. As
is well known, this leads to an underestimation of the exchange repulsions between filled orbitals,
and the errors are particularly large for filled AOs, i.e. for lone pairs. We have therefore
abandoned attempts to improve the situation by further modification of the parameters.

C. Results. SAM1 has been tested by carrying out calculations for an extensive set of
molecular species for which apparently reliable experimental data are available. Table 1 compares
the results for organic (CHON) molecules given by SAM1, AM1, and PM3. The quantities listed
are mean unsigned (MU) errors, root mean square (RMS) errors, and mean signed (MS) errors in the
calculated heats of formation.
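
For reference, the three statistics can be computed as in the following sketch. The function name and the sign convention (calculated minus experimental, so that a heat of formation that is too negative gives a negative error) are assumptions made for illustration, not details taken from the programs used here.

```python
import math


def error_statistics(calculated, experimental):
    """Return mean unsigned (MU), root mean square (RMS), and mean signed (MS)
    errors, with each error taken as calculated minus experimental."""
    errors = [c - e for c, e in zip(calculated, experimental)]
    n = len(errors)
    return {
        "MU": sum(abs(x) for x in errors) / n,
        "RMS": math.sqrt(sum(x * x for x in errors) / n),
        "MS": sum(errors) / n,
    }
```

A mean signed error close to zero, combined with a much larger mean unsigned error, indicates errors that largely cancel in sign rather than a systematic offset.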


TABLE 1. MEAN ERRORS (KCAL/MOL) IN HEATS OF FORMATION

Number of                             Procedure
Molecules     Type of Error     SAM1      AM1       PM3

217           MU                4.16      7.34      4.44
              RMS               5.66     10.30      6.07
              MS               -0.17      0.56     -0.27

Better results are expected when parametrization of SAM1 is finalized.


Work  on  this  project  is  being  continued  with  further  support  from  AFOSR. 